#!/usr/bin/env python
"""
This Python script downloads the dependencies for the build. It is normally used to facilitate component builds.

Dependencies are listed in a text file and are, by default, checked out to a directory called "dependencies".
The location of this directory can be overridden by a command-line argument.

fetchDependencies (or fetch) assumes that it is called from a 'bootstrap' folder in an existing SVN project.
It further assumes that the dependencies are from the same repository. It can cope when these
assumptions do not hold, but such cases are treated as exceptions.

A typical call is 'python bootstrap/fetchDependencies.py'.

Examples:
    To get usage info...
    > python bootstrap/fetchDependencies.py -h

    To get default dependencies...
    > python bootstrap/fetchDependencies.py

    To set custom dependency definition file and output directory...
    > python bootstrap/fetchDependencies.py -f dependencyListHead.txt -d dependencyHead

    To fetch unmodified dependencies even if there are local modifications...
    > python bootstrap/fetchDependencies.py -i

    To fetch with verbose output...
    > python bootstrap/fetchDependencies.py -v

    To create a "tagged" version of the dependency definition file...
    > python bootstrap/fetchDependencies.py -t taggedList.txt

    To check out on-repo dependencies at a specific revision...
    > python bootstrap/fetchDependencies.py -r 123456


**************************************************************************************************************
*
*    DO NOT COMMIT TO BOOTSTRAP.
*    (Discuss changes with the CI Team).
**************************************************************************************************************


Project Layout
----------------------------------
Fetch assumes that it is used in a typical project layout, for the purpose of checking out code that is
ultimately part of the same project.

The project layout is:

project/
  dependencyList.txt
  ...
  bootstrap/
    fetchDependencies.py
    ...


The Dependency Definition File and the Master Dependency Definition File
----------------------------------------------------------------------------
The Dependency Definition File, supplied by the project, lists the names of all required dependencies, one per line.
The default Dependency Definition File is named dependencyList.txt, as shown in the project layout above.

# Dependency Definition File (dependencyList.txt)
master-dependencies https://url-to-master
dep1
dep2

Each name should have a match in the Master Dependency Definition File. This file is downloaded from an SVN location
specified in the Dependency Definition File (master-dependencies in the sample directly above).

The Master Dependency Definition File lists the names of all dependencies and assigns them a url. The format of the url
is compatible with svn:externals.

# Master Dependency Definition File (at https://url-to-master)
dep1 https://url-to-dep1
dep2 https://url-to-dep2
...

The Dependency Definition File may also assign a url to a dependency. In this scenario the url in the Master Dependency
Definition File is ignored for that dependency. See 'override' below.

# Dependency Definition File (dependencyList.txt)
master-dependencies https://url-to-master
dep1
dep2
override https://ignore-master-url.for/override
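
As a minimal sketch of the rule above (not fetch's actual code), the choice between the two files can be
modelled as follows. The effective_url helper and the master_entries dictionary are hypothetical names
used for illustration only.

```python
def effective_url(local_line, master_entries):
    # Hypothetical helper: pick the url for one Dependency Definition File line.
    # master_entries maps dependency names to urls from the master file.
    tokens = local_line.split()
    name = tokens[0]
    if len(tokens) > 1:
        # A url assigned locally wins; the master entry is ignored.
        return name, tokens[-1]
    return name, master_entries[name]


master = {
    'dep1': 'https://url-to-dep1',
    'override': 'https://url-to-override-in-master',
}
print(effective_url('dep1', master))
print(effective_url('override https://ignore-master-url.for/override', master))
```

A name that is missing from both files has no url, which fetch treats as an error.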


Relating Dependencies to Project
----------------------------------
Fetch assumes it is checking out parts of the same project, e.g. jabber-all. Therefore, it tries to
tie all dependencies to the main project. This affects the url and revision of checked-out dependencies.

A dependency is tied to a project if the first part of the directory structure matches it.

For example, the SVN project https://wwwin-svn-bxb.cisco.com/jabber-all/jabber/
will match any dependency whose url path starts with 'jabber-all'.

This assumes that changes in the network location represent svn mirrors.

Note that it also ties branches on the same repository together. For example, the following will be tied:
    https://wwwin-svn-bxb.cisco.com/jabber-all/jabber/trunk/project
    https://wwwin-svn-bxb.cisco.com/jabber-all/jabber/branches/dependency
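
The matching rule can be sketched roughly as below. This is an illustration under assumptions, not the
script's own code: it uses Python 3's urllib.parse (fetch itself imports the Python 2 urlparse module),
and top_level_directory and is_tied are hypothetical helper names.

```python
from urllib.parse import urlparse


def top_level_directory(url):
    # '/jabber-all/jabber/trunk' -> 'jabber-all'; '' when the url has no path.
    parts = urlparse(url).path.split('/')
    return parts[1] if len(parts) > 1 else ''


def is_tied(project_url, dependency_url):
    # Tied when both urls share the same, non-empty, first path segment.
    # The network location is deliberately not compared (mirrors differ there).
    top = top_level_directory(project_url)
    return bool(top) and top == top_level_directory(dependency_url)


print(is_tied('https://wwwin-svn-bxb.cisco.com/jabber-all/jabber/trunk/project',
              'https://wwwin-svn-sjc.cisco.com/jabber-all/jabber/branches/dependency'))
```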

On Jenkins, a situation can arise where Jenkins and fetch use different versions of the SVN client.
When this happens, fetch cannot query the project using its svn client and falls back to the
SVN_REVISION and SVN_URL variables provided by the Jenkins job and its Subversion plugin (or to
the first pair, SVN_REVISION_1 and SVN_URL_1, when multiple repos are provided).


Fetch Output
-----------------------------
After running fetchDependencies.py (with, say, scripts as the dependency), a typical project looks as follows:

project/
  dependencyList.txt
  ...
  bootstrap/
    fetchDependencies.py
    ...
  dependencies/
    scripts/
    ...


Local Changes to Dependencies
----------------------------
Where possible, don't modify dependencies. (Consider working without fetch dependencies).

If fetchDependencies finds a modification to a dependency, its default behaviour is to exit without updating
the dependencies. In interactive mode the program can still update non-modified dependencies.

In server mode, fetch will delete local modifications. This is useful on Jenkins or if a dependency SVN checkout
ends up in a bad state.


Selecting the Dependency Revision
---------------------------------
The default dependency SVN revision is tied to the project.  If a dependency is from the same repository as
the project, then it will be updated to whatever revision the project is at.

A particular revision can be specified at the command line, using the '--revision=' option. HEAD can be passed
as the revision here. (This provides the old default behaviour.)

If a revision is 'pegged' in either dependency file, then this overrides the project's revision.

Example: to get a particular revision of scripts, specify either of the following in the
Dependency Definition File:

scripts -r2211 https://wwwin-svn-sjc.cisco.com/core-grp/csf-core/scripts/trunk
scripts https://wwwin-svn-sjc.cisco.com/core-grp/csf-core/scripts/trunk@2211

In cases where the dependency is from a different repository, pegging is suggested, as otherwise 'HEAD' is used and
fetch's output is time-dependent and non-reproducible.
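
The precedence described in this section can be summarised by the sketch below. It is a simplified
model, and the resolve_revision name and its parameters are assumptions made for illustration.

```python
def resolve_revision(pegged_revision, tied_to_project, target_revision):
    # pegged_revision: revision pegged in either dependency file, '' when not pegged.
    # tied_to_project: True when the dependency comes from the project's repository.
    # target_revision: the project's revision, or the one given on the command line.
    if pegged_revision:
        return pegged_revision      # pegging always wins
    if tied_to_project:
        return target_revision      # follow the project
    return 'HEAD'                   # other repositories: time-dependent


print(resolve_revision('2211', True, '123456'))
print(resolve_revision('', True, '123456'))
print(resolve_revision('', False, '123456'))
```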


Dependency Network Location
-------------------------------------------
By default, the 'project' folder sets both the network location and the revision of all the dependencies tied to it.
For example, if the project is checked out from wwwin-svn-sjc.cisco.com, the dependencies will also be
checked out from wwwin-svn-sjc.cisco.com. As discussed above, the revisions will also match.

So, although the dependency url specified in the Dependency Definition File contains a network location
(e.g. wwwin-svn-bxb.cisco.com), the project's network location will override it. This default allows implicit and
simple repo selection. It can be turned off by passing '--useDependencyListUrls' when calling fetch.
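
A rough sketch of the network location override follows. It assumes Python 3's urllib.parse (fetch
itself imports the Python 2 urlparse module), and use_project_netloc is a hypothetical name. In fetch
the swap is only applied to dependencies that are tied to the project.

```python
from urllib.parse import urlparse, urlunparse


def use_project_netloc(project_url, dependency_url):
    # Keep the dependency's scheme and path; take the host from the project url.
    project = urlparse(project_url)
    dependency = urlparse(dependency_url)
    return urlunparse(dependency._replace(netloc=project.netloc))


print(use_project_netloc(
    'https://wwwin-svn-sjc.cisco.com/jabber-all/jabber/trunk/project',
    'https://wwwin-svn-bxb.cisco.com/jabber-all/jabber/trunk/dependency'))
```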


"""

import sys
import os
import time
import copy
import subprocess
import tempfile
import optparse
import threading
import urlparse
import Queue
from threading import Thread
from datetime import datetime
from ftplib import FTP

STARS = "****************************************************"
TAB = "    "
MASTER_DIR = "master-dependencies"
BASE_URL_SPECIFIER = "repo_base"
METADATA_FILENAME = "metaData.txt"
TESTS_BREAK = "#--tests--"

ENVIRONMENTAL_VARIABLE = "FETCH_DEPENDENCIES"

# Global configuration variables which are set by the user in main()
verbose = None
server_mode = None
SVN = None
interactiveMode = None


class FetchException(Exception):
    pass


def _parse_line_info(line):
    """
    Parse out the svn info for a given dependency line.
    The string passed in is expected to be in the format used in the dependencyList.txt file.

    """

    if not line:  # Possible that dependency is not in the master list or it's master!
        return '', '', ''
    tokens = line.split()
    name = tokens[0]

    if len(tokens) < 2:
        rev = ""
        url = ""
    elif tokens[1].startswith('-r'):
        rev = tokens[1].strip()[2:]
        url = tokens[2].strip()
    else:
        index = tokens[1].strip().find('@')
        if -1 == index:
            url = tokens[1].strip()
            rev = ''
        else:
            url = tokens[1].strip()[:index]
            rev = tokens[1].strip()[index+1:]
    url = url.rstrip('/ ')  # normalize url to prevent unnecessary checkouts
    return name, url, rev


class Dependency(object):
    """
    Class that represents a dependency line.
    Manages the retrieved revision and repo, which may change from the raw file entry.
    
    """
    project_url = None
    target_revision = None
    updated_revisions = set()
    translate_urls = True


    def __init__(self, dependency_list_line, master_line=''):
        master_name, master_url, master_rev = _parse_line_info(master_line)
        local_name, local_url, local_rev = _parse_line_info(dependency_list_line)

        if master_name and master_name != local_name:
            raise Exception("Error Combining Master List and local Dependency List")
        self._name = local_name
        if local_url:
            self._url = local_url
            revision = local_rev
        else:
            self._url = master_url
            revision = master_rev
        if revision:
            self._revision = revision
        else:
            self._revision = None
        if not self._url:
            raise Exception("Error Parsing Dependency: Url not parsed: " + dependency_list_line + ", " + master_line)

        self._is_ftp = self._url.startswith('ftp')
        self._updated = False

    def _convert_repo(self):
        """
        Converts this dependency's SVN net location to the net location of the project
        (Dependency.project_url). Only dependencies tied to the project are converted.

        For example:
        project url:    https://wwwin.svn-bxb.com/jabber-all/trunk/project
        dependency url: https://wwwin.svn-sjc.com/jabber-all/trunk/dependency

        returns: https://wwwin.svn-bxb.com/jabber-all/trunk/dependency

        """
        if self._is_dependency_url_from_project_repo():
            # replace the dependency line url net location using the project's url net location
            dependency_parsed_url = urlparse.urlparse(self._url)
            project_parsed_url = urlparse.urlparse(Dependency.project_url)
            url = urlparse.urlunparse(  # e.g. www.svn-sjc.cisco.com -> www.svn-bxb.cisco.com
                (dependency_parsed_url[0],         # This is a tuple, hence the (( ))
                 project_parsed_url.netloc,        # (and also why it cannot be changed)
                 dependency_parsed_url[2],
                 dependency_parsed_url[3],
                 dependency_parsed_url[4],
                 dependency_parsed_url[5]),
            )
        else:
            url = self._url
        return url

    @property
    def name(self):
        return self._name

    @property
    def url(self):
        if Dependency.translate_urls:
            return self._convert_repo()
        else:
            return self._url

    @property
    def revision(self):
        if self._revision:
            return self._revision
        if self._is_dependency_url_from_project_repo():
            return Dependency.target_revision
        else:
            return "HEAD"

    @property
    def revision_url(self):
        return "{}@{}".format(self.url, self.revision)

    @property
    def is_ftp(self):
        return self._is_ftp

    def __str__(self):
        if self.revision:
            return "{} {} {}".format(self.name, self.url, self.revision)
        else:
            return "{} {}".format(self.name, self.url)

    def _is_dependency_url_from_project_repo(self):
        """
        Check that the requested url comes from the project's repository or a mirror of same.

        So, we have replicated repos for jabber. But they are all called 'jabber-all'.
        If the repo is in the set of matching repos, and it starts the directory with jabber-all,
        then we assume the repo is a replica and treat it as being synchronized.

        If this fails, fetch will assume the dependency is not in same repo, and fetch head.
        """
        dependency_url = self._url
        if Dependency.project_url:
            project_parsed_url = urlparse.urlparse(Dependency.project_url)
            project_top_level_directory = project_parsed_url.path.split('/')[1]  # e.g. '/jabber-all/jabber' -> 'jabber-all'
            dependency_parsed_url = urlparse.urlparse(dependency_url)
            dep_top_level_directory = dependency_parsed_url.path.split('/')[1]
            top_level_directory_match = project_top_level_directory == dep_top_level_directory
            return top_level_directory_match
        else:
            return False

    def update(self):
        """
        Updates the class variable with information about what version was updated

        """
        if self._revision:
            Dependency.updated_revisions.add('{}: {}'.format(self.name, self.revision))
        elif self._is_dependency_url_from_project_repo() and Dependency.target_revision:
            Dependency.updated_revisions.add('project: ' + self.revision)
        else:
            # Same format as the first clause. The revision is not pegged and cannot be taken
            # from the project, so record whatever self.revision resolves to rather than
            # working out the actual revision number here.
            Dependency.updated_revisions.add('{}: {}'.format(self.name, self.revision))

    @classmethod
    def print_updates(cls):
        print('\nUpdated to revisions:\n\t' + '\n\t'.join(cls.updated_revisions))
        print('\n')


def getDependencyList(file_path):
    """
    Retrieve the list of ALL dependency names found in a file e.g. dependencyList.txt
    This function is called from project SConstruct files to build up the list of ALL dependencies.

    """

    dep_list = []

    dependencies = _read_dependency_file( file_path )

    for line in dependencies:

        if line.startswith( MASTER_DIR ) :
            continue # The SConstructs never want the master as a SConscript dependency or the test break

        # First word is the dependency name
        dep_name = line.split()
        dep_list.append(dep_name[0])

    return dep_list


def getDependencySubLists(file_path):
    """
    Retrieve the lists of dependency names found in a file (e.g. dependencyList.txt), split into component and test lists.
    This function is called from project SConstruct files to build up the lists of dependencies.

    """

    test_dep_list = []
    comp_dep_list = []

    dependencies = _read_dependency_file( file_path, True )

    processing_comp_deps = True

    for line in dependencies:

        if line.startswith( MASTER_DIR ):
            continue # The SConstructs never want the master as a SConscript dependency

        if line.startswith( TESTS_BREAK ):
            processing_comp_deps = False # switch to processing test dependencies
            continue # The SConstructs never want the test break so skip

        # First word is the dependency name
        dep_name = line.split()

        if processing_comp_deps:
            comp_dep_list.append(dep_name[0])
        else:
            test_dep_list.append(dep_name[0])

    return comp_dep_list, test_dep_list


def _set_defaults_from_environment(oparser):
    """
    Reads the FETCH_DEPENDENCIES environment variable and parses its arguments. The result
    becomes the new defaults, but arguments can still be overridden from the command line.
    
    e.g.
    Command line specifies value            "--dependencyFile commandFile.txt"
    FETCH_DEPENDENCIES specifies value      "--dependencyFile envFile.txt"
    --dependencyFile has hard coded default "dependencyList.txt"
    
    In this scenario the argument will be set to commandFile.txt. But if not specified at the 
    command line the value in FETCH_DEPENDENCIES will be used. If that is not specified the
    hard coded default applies.
    
    """
    if ENVIRONMENTAL_VARIABLE in os.environ:
        environmental_args = os.environ[ ENVIRONMENTAL_VARIABLE ].split()
        options, _ = oparser.parse_args( environmental_args )
        oparser.defaults = options.__dict__


def _read_dependency_file(dependency_file_path, include_breaks = False):
    """
    Reads a dependency file (e.g. dependencyList.txt).
    Comments and empty lines are removed, so only valid dependency lines are returned.
    If include_breaks is set, the tests break marker (TESTS_BREAK) lines are kept as well.

    """

    dependencies = []

    try:
        dep_file = open( dependency_file_path )
    except IOError:
        print >> sys.stderr, "Fatal Error: Cannot find the dependencies file - %s" % dependency_file_path
        sys.exit(1) # Kill the script

    for line in dep_file:
        line = line.strip()
        if line.startswith(TESTS_BREAK) and include_breaks:
            dependencies.append(line)

        if not line.startswith('#') and line:
            dependencies.append(line)

    dep_file.close()
    return dependencies


class ThreadOutputInfo:
    """ Holds the console output for a thread of execution """

    def __init__(self, name):
        self.name = name # Name of dependency associated with the thread of execution
        self.msg = "" # Top most message i.e. The reason for logging the info
        self.messages = [] # All the messages to date
        self.errors = [] # All the errors to date
        self.done = False # Whether the thread of execution is finished


    def __str__(self):
        """
        Returns a string representing the message(s) to display on the console
        """

        # Note - messages already contains errors
        if verbose:
            out = ""
            for m in self.messages:
                out += "\n\t" + m

            return out + "\n"

        if self.errors:
            out = ""
            for m in self.errors:
                out += "\n\t" + m

            return out + "\n"

        return self.msg


    def __nonzero__(self):
        """Controls what happens if a consumer checks the bool value of the ThreadOutputInfo."""
        return len(self.errors)


    def copy(self):
        """Make a copy of the object"""
        copy_info = ThreadOutputInfo(self.name)
        copy_info.msg = copy.copy(self.msg)
        copy_info.messages = copy.copy(self.messages)
        copy_info.errors = copy.copy(self.errors)
        copy_info.done = self.done

        return copy_info


class ThreadOutput:
    """
    Stores the console output for a thread of execution.
    For every "log" the output will be queued where it can be accessed safely from another thread.

    """
    def __init__(self, name, threadQueue):
        self.info = ThreadOutputInfo(name)
        self.threadQueue = threadQueue


    def __del__(self):

        """Calls complete when the object is destroyed.
        This means the consumer does not need to remember to manually call complete().

        """

        self.complete()


    def __nonzero__(self):
        """
        Controls what happens if a consumer checks the bool value of the ThreadOutput.

        e.g.    out = ThreadOutput("name", Queue.Queue())
                out.write("My message")

                if out:
                    # won't be hit

                out.err("My error")

                if out:
                    # Will be hit

        """

        return len(self.info.errors)


    def write(self, msg):

        """Log some info from the thread of execution. This may then be accessed from another thread."""

        m = msg.rstrip() # Remove \r\n with rstrip()
        self.info.msg = m
        self.info.messages.append(m)

        # Queue the messages in a thread safe manner. Add a copy to the queue to guarantee the msg will be parsed.
        # Otherwise it could be overwritten by the next event. e.g. output.write("msg1") & output.write("msg2") would
        # result in queue [{"name": info(..... msg = "msg2")}, {"name": info(..... msg = "msg2")}] - "msg1" is lost.
        # Copying also allows the consumer to rely upon __del__() to call complete().
        self.threadQueue.put({self.info.name : self.info.copy()})


    def verbose(self, msg):

        """Log verbose only information."""

        self.info.messages.append(msg.rstrip())

        # Only need to inform observing threads if in verbose mode
        if verbose:
            self.threadQueue.put({self.info.name : self.info.copy()})


    def err(self, msg):

        """Log error information. Errors indicate that the execution should terminate."""

        m = "Error: " + msg.rstrip()
        self.info.errors.append(m)
        self.info.messages.append(m)
        self.threadQueue.put({self.info.name : self.info.copy()})


    def complete(self):

        """Mark the thread of execution as complete. There is no more information to log."""

        if not self.info.done:
            self.info.done = True
            self.threadQueue.put({self.info.name : self.info.copy()})


def _remove_directory( output, directory ):

    """Remove the directory and all its contents.

    There have been multiple build issues stemming from directory removal. They
    occur when the directory is not deleted or only partially deleted.

    This will typically happen when another process is using it. e.g.
          - The user has the directory open
          - The user is editing a file located in the directory
          - TortoiseSVN has a handle to the .svn subdirectory (depends on TortoiseSVN version)

    This can cause a project's build to fail. i.e. Because the wrong files are on disk
    or some files are missing.

    """

    if os.path.exists(directory):

        output.write( "Deleting directory %s" % directory )

        # Build the delete command (the path is quoted so that spaces survive the shell)
        if 'win32' == sys.platform:
            cmd = 'RMDIR "' + directory + '" /s /q'
        else:
            cmd = 'rm -rf "' + directory + '"'

        so = subprocess.Popen(cmd, shell=True, stderr=subprocess.PIPE)

        err = False
        process_access_err = False
        for line in so.stderr:
            if not err:
                err = True
                output.err("Failed to delete directory %s" % directory)

            output.err( line )

            if 'win32' == sys.platform and "another process" in line:
                process_access_err = True

        if process_access_err:
            output.err("Fatal - partial delete likely")
            output.err("Possible Cause - user is editing a file in the directory or has the directory opened")
            output.err("Possible Cause - TortoiseSVN has a handle to the .svn folder - kill TortoiseProc to resolve - TortoiseProc is used by Show log, Repo-browser, etc")
            output.err("Suggestion - Kill other process & manually delete %s" % directory)

        if not err:
            output.write( "Deleted directory %s" % directory )


def _create_dependency_dir( dependencyDir ):
    """
    Creates the "dependencies" directory.
    If not already in the svn:ignore list of the project, it is added.

    """

    os.mkdir( dependencyDir )
    cmd = SVN + ['propget', 'svn:ignore', '.']

    so = subprocess.Popen(cmd, stdout = subprocess.PIPE)

    addIgnore = True
    ignoreFile = tempfile.NamedTemporaryFile(suffix='.tmp', delete=False)
    ignoreFile.seek(0)
    for line in so.stdout:
        ignoreFile.write(line)
        if line.strip() == dependencyDir:
            addIgnore = False

    if addIgnore:
        ignoreFile.write(dependencyDir)
        ignoreFile.flush()
        cmd = SVN + ['propset', 'svn:ignore', '-F', ignoreFile.name, '.']
        subprocess.call(cmd)
        

def _time_lapse( start_time ):

    """Calculate the time that has passed since the start_time and format it to be human readable"""

    current_time = datetime.now()
    elapsed_time = current_time - start_time
    return "%s minutes %s seconds" % (elapsed_time.seconds // 60, elapsed_time.seconds % 60 )

def _time_stamp():

    """Create a time stamp in local time."""

    return "%s" % time.strftime( '%d-%b-%Y %I:%M:%S %p', time.localtime( time.time() ) )



def _get_full_svn_revision_url( url, rev):
    """
    Return a consistent svn url format

    """
    if rev in ['', 'HEAD']:
        return url
    return url + '@' + rev


def _dots(dotdotdot):
    """ Simple utility to get the .... in-progress indicator."""

    if len(dotdotdot) > 10:
        return "" # Reset

    if len(dotdotdot) > 9:
        return dotdotdot + " "  # Handle a draw issue where the 10th . remains

    return dotdotdot + "."


def _fetch_ftp(printQueue, dependency_dir, targetDir, url):
    output = ThreadOutput(os.path.basename(os.path.normpath(targetDir)), printQueue)

    directory = os.path.abspath(os.path.join(dependency_dir, targetDir))

    # Create dependency directory
    if not os.path.isdir(directory):
        os.makedirs(directory)

    # Create metaData.txt and add ftp url
    metafile = os.path.join(directory, METADATA_FILENAME)
    if not os.path.isfile(metafile):
        with open(metafile, 'w+') as f:
            f.write(url)

    output.write("Fetching from FTP location %s to %s\n" % (url, targetDir))
    # Parse out FTP host and directory
    url = url.replace('ftp://', '')
    url_tokens = url.split('/', 1)
    host = ''
    remote_path = ''
    if len(url_tokens) > 1:
        host = url_tokens[0]
        remote_path = url_tokens[1]
    else:
        host = url_tokens[0]
    ftp = FTP()
    try:
        ftp.connect(host)  # Connect to ftp server
        ftp.login('anonymous', '@anonymous')  # Anonymous Login
        ftp.cwd(remote_path)
        _retrieve_ftp_directory(ftp, directory, '')  # Download remote directory contents
        ftp.quit()
    except Exception as e:
        output.err('FTP: ' + str(e))  # Print out any FTP errors

    output.write("Fetched")


def _retrieve_ftp_directory(ftp, parent_directory, directory):

    """
    Downloads the contents of the remote working directory that do not already exist in the local working directory.

    """

    targetDir = os.path.join(parent_directory, directory)

    if not os.path.isdir(targetDir):  # Create if it doesn't exist
        os.makedirs(targetDir)

    ftp.cwd(directory)  # Change into remote folder

    sub_directories, files = _list_ftp_directory(ftp)  # Get the sub directories and files of the remote working directory

    for file in files:  # Download each file if it doesn't already exist locally
        localFilePath = os.path.join(targetDir, file)
        if not os.path.isfile(localFilePath):
            local_file = open(localFilePath, 'wb')
            ftp.retrbinary('RETR ' + file, local_file.write)
            local_file.close()

    for sub_directory in sub_directories:  # Create each remote sub directory locally and call retrieve_ftp_directory to download contents
        _retrieve_ftp_directory(ftp, targetDir, sub_directory)  # Retrieve contents

    ftp.cwd('..')  # move back to parent remote directory


def _list_ftp_directory(ftp):
    """
    Returns the files and directories of the remote working directory

    """

    items = []

    ftp.dir(items.append)  # Get all the items from the current remote working directory

    sub_directories = []
    files = []
    for item in items:
        item_tokens = item.split()
        item_details = item_tokens[0]  # First token is the permissions which can be used to determine if folder or file
        item_name = item_tokens[-1]  # Last token is the name of the item

        if item_name in ['..', '.']:  # skip the parent directory and current directory shortcuts
            continue
        elif item_details.upper().startswith('D'):  # permissions start with a 'd', if item is a directory
            sub_directories.append(item_name)
        else:
            files.append(item_name)  # if not a directory, must be a file

    return sub_directories, files


def _check_ftp_url(print_queue, dependency_dir, name, url):
    """
    Checks if the specified url matches the url in metaData.txt of the dependency folder.
    If the urls don't match, deletes the dependency folder to trigger a download.

    """

    output = ThreadOutput(name, print_queue)
    output.write("Checking FTP...")

    targetDir = os.path.join(dependency_dir, name)

    if not os.path.exists(targetDir):
        output.write("Directory does not exist")
        return
    else:
        # Check if url in metaData.txt matches url
        localUrl = ''
        metafile = os.path.join(targetDir, METADATA_FILENAME)
        if os.path.isfile(metafile):
            with open(metafile, 'r') as f:
                localUrl = f.readline()

        if localUrl == url:
            output.write("Fetching any missing files from FTP")
        else:
            _remove_directory(output, targetDir)  # Delete the directory if the urls don't match
            output.write("Dependency urls don't match")
            output.verbose("Detected %s but require %s" % (localUrl, url))


def _svn_checkout(printQueue, url, targetDir):
    """
    Check out the svn repository from the url to the targetDir.
    If there is a required revision, it should be specified by appending '@revision' to the url.

    """
    output = ThreadOutput(os.path.basename(os.path.normpath(targetDir)), printQueue)
    _remove_directory(output, targetDir)
    if output:
        return

    output.write( "Checking out %s to %s\n" % (url, targetDir) )
    cmd = SVN + ['checkout', url, targetDir]
    so = subprocess.Popen(cmd, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
    for line in so.stdout:
        output.write(line)
    for line in so.stderr:
        output.err(line)
    output.write("Checked out")


def _svn_update( printQueue, targetDir, revision ):
    """
    Update the existing svn checkout in targetDir.
    The required revision is passed separately and handed to 'svn update' via '-r'.

    """

    output = ThreadOutput(os.path.basename(os.path.normpath(targetDir)), printQueue)

    output.write( "Updating %s" % targetDir )

    cmd = SVN + ['update', targetDir, '-r', revision]

    so = subprocess.Popen(cmd, stdout = subprocess.PIPE, stderr = subprocess.PIPE)

    for line in so.stdout:
        output.write(line)

    for line in so.stderr:
        output.err(line)

    output.write( "Updated to revision %s" % revision )


def _svn_get_info(output, svnDir):
    """
    Get the url, revision, and last changed revision for an existing checkout.
    The revision numbers will be returned as Int(..). Strings can cause subtle
    bugs with the less than (<) and greater than (>) operators.

    """
    cmd = SVN + ['info', svnDir]

    so = subprocess.Popen(cmd, stdout = subprocess.PIPE)

    infoUrl = ""
    infoRev = 0
    infoLastChangedRev = 0
    for line in so.stdout:
        if line.startswith("URL: "):
            infoUrl = line.split(" ")[1].strip()
            infoUrl = infoUrl.rstrip('/ ') # normalise url to prevent unnecessary checkouts
        elif line.startswith("Revision:"):
            infoRev = int(line.split(":")[1].strip())
        elif line.startswith("Last Changed Rev:"): # Last Change Rev may not exist i.e. If a folder is not found
            infoLastChangedRev = int(line.split(":")[1].strip())

    output.write( "Revision: %s LastChange: %s Uri: %s" % (infoRev, infoLastChangedRev, svnDir) )

    return (infoUrl, infoRev, infoLastChangedRev)



def _get_jenkins_svn_revision_info():
    """
    As a fallback to querying the project, read the SVN information that Jenkins provides
    in the environment. This works on Jenkins where the SVN Plugin is used.

    """
    jenkins_svn_revision, jenkins_svn_url = None,None
    for sr,su in [('SVN_REVISION','SVN_URL'), ('SVN_REVISION_1', 'SVN_URL_1')]:
        if sr in os.environ:
            jenkins_svn_revision = os.environ[sr]
            jenkins_svn_url = os.environ[su]
    return jenkins_svn_url, jenkins_svn_revision, 0


def _get_project_svn_info():
    """
    Determines the SVN url and revision of the project that uses 'fetchDependencies'.

    As a heuristic, the _directory_ above the one containing fetchDependencies is queried.

    It is assumed that 'fetchDependencies' is in 'bootstrap', located in an SVN folder.
    If this fails, the revision number must be specified on the command line.
    """

    project_dir = os.path.join(os.path.dirname(__file__), '..')
    # An output object is required to meet the _svn_get_info api.
    q = Queue.Queue()
    output = ThreadOutput('svn query', q)
    url, revision, last_changed = _svn_get_info(output, project_dir)
    if (urlparse.urlparse(url).netloc == ''
            or not int(revision) > 0
            or not int(last_changed) > 0):
        # We can try using Jenkins variables...
        # The usual cause of this failure is that the Jenkins svn and the
        # slave's svn are not at the same version.
        url, revision, last_changed = _get_jenkins_svn_revision_info()
        if not url:
            raise FetchException('unable to find project svn information')
    return url, str(revision), last_changed


def _svn_check_for_local_mods( output, targetDir ):

    """Check a svn checkout for local modifications."""

    output.write("Checking for local modifications in %s" % targetDir)

    cmd = SVN + ['stat', '-q', '--ignore-externals', targetDir]

    so = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    updatesPresent = False
    for line in so.stdout:
        if line.startswith('X'):
            continue # some versions of svn seem to always print externals.
        output.verbose( "%s" % line )
        updatesPresent = True

    for line in so.stderr:
        output.err( "%s" % line )

    output.write("No local modifications" if not updatesPresent else "Local modifications detected")

    return updatesPresent


def _establish_svn_download(print_queue, checkout_queue, update_queue, edit_list, dependency_dir, dependency):
    """
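    Decide whether the dependency needs a checkout, an update, or nothing.
    If it is not on disk, it is queued for checkout. If it is already on disk,
    check for local modifications and for changes to the svn url or revision.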

    """
    output = ThreadOutput(dependency.name, print_queue)
    output.write("Checking..." + str(dependency))

    # Update that we are planning to check out this revision
    dependency.update()

    targetDir = os.path.abspath(os.path.join(dependency_dir, dependency.name))

    # If we don't already have the dependency, we just download...job done!
    if not os.path.exists(targetDir):
        checkout_queue.put({targetDir:dependency.revision_url})
        output.write("Directory does not exist")
        return
    else:
        # Check for local changes - so we always inform user that they are there
        if _svn_check_for_local_mods(output, targetDir):
            edit_list.append(targetDir)

        if output:
            # If the output is true there were errors so return
            return

        # If there are local modifications in server mode do a fresh checkout
        if server_mode and targetDir in edit_list:
            checkout_queue.put({targetDir:dependency.revision_url})
            output.write("Dependency has local modifications")
            return

        if output:
            return

        localUrl, localRev, local_last_change = _svn_get_info(output, targetDir)

        if output:
            return

        if localUrl != dependency.url:
            checkout_queue.put({targetDir:dependency.revision_url})
            output.write("Dependency urls don't match")
            output.verbose("Detected %s but require %s" % (localUrl, dependency.url))
            return
        else:
            # Check the last time url@revision changed.
            _, _, remoteLastChangedRev = _svn_get_info(output, dependency.revision_url)
            if output:
                return

            if remoteLastChangedRev != local_last_change:
                output.write("Need to update to revision %s from %s" % (remoteLastChangedRev, local_last_change))
                update_queue.put({targetDir:dependency.revision})
                return

    output.write("Dependency is up-to-date (last change:%s)" % str(remoteLastChangedRev))


def _wait_on_threads(thread_output_queue, start_time, total_start_time):

    """Waits for the dependency threads to complete execution.
    Based on the result this method will either continue or terminate the script.

    """

    refresh_rate = 0.25
    dotdotdot = ""
    dependency_output = {}

    # Wait until "this" is the only active thread.
    # If "this" is already the only active thread check that printQueue is not empty.
    # i.e. The threads may finish execution before this point but the code should still
    # print the info to console.
    while threading.active_count() > 1 or not thread_output_queue.empty():
        time.sleep(refresh_rate)

        while not thread_output_queue.empty():
            pair = thread_output_queue.get() # Queues are thread safe
            key = pair.keys()[0]
            value = pair[key]

            if value.done:
                # Thread execution has finished
                dependency_output[key] = value
                # Note - \r causes the last line to be rewritten and only works when used with sys.stdout.write
                sys.stdout.write( ("\r" + TAB + "{0: <35}{1: <10}\n").format(key, value) )
                sys.stdout.flush()

        if not server_mode:
            dotdotdot = _dots(dotdotdot)
            sys.stdout.write( "\r{0: <15}{1: <10}".format(_time_lapse(start_time), dotdotdot ) )
            sys.stdout.flush()

    sys.stdout.write( "\rTime (%s)\n" % _time_lapse(start_time) )
    sys.stdout.flush()

    # If there are errors the script should terminate
    error_count = 0
    for dep in dependency_output.keys():
        outcome = dependency_output[dep]

        if outcome.errors:
            error_count += 1
            print >> sys.stderr, "\n%s" % dep
            for error in outcome.errors:
                print >> sys.stderr, (TAB + "%s") % error

    if error_count > 0:
        print >> sys.stderr, ("\n\n" + STARS + "\n\n%s Error(s) - terminating\n%s\n\n" + STARS + "\n") % ( error_count, _time_lapse(total_start_time) )
        sys.exit(1) # Exit script in error


def _fetch_dependencies(dependency_list, dependencyDir):
    """
    Check the existing dependencies and fetch any missing or updated dependencies.

    The fetch has two phases.

    Phase 1:

        Create the check out directory if it does not already exist.
        Check each dependency from the dependency definition to determine if it needs to be checked out or updated.
        Check if existing dependencies have local modifications.
        Terminate/Prompt the user if there are local modifications in a dependency that needs to be checked out/updated.
        Note: In some scenarios the script will do a new checkout instead of updating an existing dependency, e.g. when the url changes.

    Phase 2:

        Check out or update non-conflicting dependencies.

    """
    total_start_time = datetime.now()

    # If the dependency directory doesn't exist, create it, and set it to ignore
    if not os.path.exists(dependencyDir):
        _create_dependency_dir(dependencyDir)

    # Ensure that each dependency name has an associated url
    error_count = 0
    for dep in dependency_list:

        if not dep.url:
            error_count += 1
            print >> sys.stderr, "\n%s" % dep.name
            print >> sys.stderr, (TAB + "A repository url could not be found in either the dependency or master list")

    if error_count > 0:
        print >> sys.stderr, ("\n\n" + STARS + "\n\n%s Error(s) - terminating\n\n" + STARS + "\n") % ( error_count )
        sys.exit(1)

    checkout_queue = Queue.Queue()
    update_queue = Queue.Queue()
    thread_output_queue = Queue.Queue()
    modified_dependencies = []

    start_time = datetime.now()

    # Check for ftp dependencies, put them into ftp_dependencies, and remove them from dependency_list
    ftp_list = [d for d in dependency_list if d.is_ftp]
    # Create a Dictionary
    ftp_dependencies = dict()
    for ftp in ftp_list:
        ftp_dependencies[ftp.name] = ftp.url

    # Remove the ftp dependencies from the list
    dependency_list = [d for d in dependency_list if not d.is_ftp]

    # Phase 1 - kick off a thread to check the "update" state of each dependency
    for line in dependency_list:
        t = Thread(target=_establish_svn_download, args=(thread_output_queue, checkout_queue, update_queue, modified_dependencies, dependencyDir, line))
        t.start()

    # Check the ftp meta data
    for name in ftp_dependencies:
        t = Thread(target=_check_ftp_url, args=(thread_output_queue, dependencyDir, name, ftp_dependencies[name]))
        t.start()

    print "Checking"

    _wait_on_threads(thread_output_queue, start_time, total_start_time)

    # Get the folder name with os.path.basename(os.path.normpath(..))
    print "\n\nFetching from FTP %s" % [ os.path.basename(os.path.normpath(i)) for i in ftp_dependencies.keys() ]

    checkout_dependencies = {}
    while not checkout_queue.empty():
        checkoutPair = checkout_queue.get()
        # Get first and only key value pair in list
        checkout_dependencies[checkoutPair.keys()[0]] = checkoutPair[checkoutPair.keys()[0]]

    # Get the folder name with os.path.basename(os.path.normpath(..))
    print "\n\nChecking out %s" % [ os.path.basename(os.path.normpath(i)) for i in checkout_dependencies.keys() ]

    update_dependencies = {}
    while not update_queue.empty():
        updatePair = update_queue.get()
        update_dependencies[updatePair.keys()[0]] = updatePair[updatePair.keys()[0]]

    print "\n\nUpdating %s" % [ os.path.basename(os.path.normpath(i)) for i in update_dependencies.keys() ]

    if 0 < len(modified_dependencies):
        print "\n\nLocal edits %s" % [ os.path.basename(os.path.normpath(edit)) for edit in modified_dependencies ]

    print "\n\n"

    if not server_mode:
        # SVN Conflicts
        conflicted    = [ co for co in dict(checkout_dependencies.items() + update_dependencies.items()) if co in modified_dependencies ]
        notConflicted = [ nc for nc in dict(checkout_dependencies.items() + update_dependencies.items()) if nc not in modified_dependencies ]

    else:
        conflicted    = [] # In server mode there are no conflicts as modified dependencies are deleted and checked out
        notConflicted = [ nc for nc in dict(checkout_dependencies.items() + update_dependencies.items()) ]

    if 0 < len(conflicted):
        print >> sys.stderr, \
            "ERROR: There are local modifications in directories that need to be updated.\n"\
            + "Please revert, or manually update, the following checkouts, and try again."

        for dep in conflicted:
            print >> sys.stderr, "    %s" % dep

        if not interactiveMode and not 0 == len(notConflicted):
            print "\nYou may also re-run in interactive mode to update non-conflicted dependencies"
            print "Re-run with --interactive and follow instructions"
            sys.exit(1)

        if 0 == len(notConflicted):
            sys.exit(1)

        print "\n\nDo you want to update the 'following' non-conflicted dependencies"
        for nc in notConflicted:
            print (TAB + "    %s") % nc

        tell_user = raw_input("Enter 'yes' or 'no' > ")
        if not tell_user.strip().lower() in ["yes", "y"]:
            sys.exit(1)

        print "\n\n"

    if not checkout_dependencies and not update_dependencies and not ftp_dependencies:
        print (STARS + "\n\nSuccess - Dependencies are already up-to-date \n%s\n\n" + STARS + "\n") % _time_lapse(total_start_time)
        return

    thread_output_queue = Queue.Queue()
    start_time = datetime.now()

    # Phase 2 - checkout or update the dependencies
    for key in checkout_dependencies.keys():
        if key in notConflicted:
            t = Thread(target=_svn_checkout, args=(thread_output_queue, checkout_dependencies[key], key))
            t.start()

    for key in update_dependencies.keys():
        if key in notConflicted:
            t = Thread(target=_svn_update, args=(thread_output_queue, key, update_dependencies[key]))
            t.start()

    for key in ftp_dependencies.keys():
        t = Thread(target=_fetch_ftp, args=(thread_output_queue, dependencyDir, key, ftp_dependencies[key]))
        t.start()

    print "Downloading\n"
    _wait_on_threads(thread_output_queue, start_time, total_start_time)
    print ("\n\n" + STARS + "\n\nSuccess - All Dependencies have been downloaded \n%s\n\n" + STARS + "\n") % _time_lapse(total_start_time)





def _assemble_dependency_list(dependency_list, master_list):
    """
    Merges the contents of the dependency list (e.g. dependencyList.txt) and the master list (e.g. masterList.txt).
    If a url is defined in the dependency list it will be used. But if the dependency url is left blank then the url found in
    the master will be used.

    e.g.

        Dependency File                                        Master File
        boost                                                  boost                http://boost.org
        libcurl                                                libcurl              http://libcurl.com
        csf2g-foundation    http://csf2g.cisco.com             csf2g-foundation     http://foundation.cisco.com

    Will result in an assembled dependency list of

        boost               http://boost.org
        libcurl             http://libcurl.com
        csf2g-foundation    http://csf2g.cisco.com

    """
    dependencies = {}
    dependency_list_dict = {}
    master_list_dict = {}
    for line in dependency_list:
        name = line.split()[0]
        dependency_list_dict[name] = line
    for line in master_list:
        name = line.split()[0]
        master_list_dict[name] = line

    for name in dependency_list_dict:
        master_line = master_list_dict.get(name,'')
        dependencies[name] = Dependency(dependency_list_dict[name], master_line)

    return dependencies.values()
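

# Illustrative sketch, not called anywhere in this script: the merge rule
# described in the _assemble_dependency_list docstring, restated with plain
# dictionaries. The Dependency class is defined elsewhere in this file, so this
# sketch only models "use the dependency url if given, else the master url".
# The name _example_merge_urls and the simple "name [url]" line format are
# assumptions made for illustration.
def _example_merge_urls(dependency_lines, master_lines):
    """Return a dict mapping each dependency name to its resolved url."""
    master_urls = {}
    for line in master_lines:
        fields = line.split()
        if fields:
            master_urls[fields[0]] = fields[1] if len(fields) > 1 else ''
    merged = {}
    for line in dependency_lines:
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        # A url in the dependency list takes precedence over the master list
        merged[name] = fields[1] if len(fields) > 1 else master_urls.get(name, '')
    return merged

# Hypothetical usage, with the lists from the docstring example above:
#     _example_merge_urls(["boost", "csf2g-foundation http://csf2g.cisco.com"],
#                         ["boost http://boost.org",
#                          "csf2g-foundation http://foundation.cisco.com"])
# maps boost to http://boost.org and csf2g-foundation to http://csf2g.cisco.com.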


def _tag_dependency(print_queue, tagged_dependencies, dependency_dir, name):
    """
    Checks the local disk to determine the svn url and revision of a dependency
    """

    output = ThreadOutput(name, print_queue)
    output.write("Tagging...")

    targetDir = os.path.abspath(os.path.join(dependency_dir, name))

    tag_url, tag_rev, _ = _svn_get_info(output, targetDir )

    # Must use the revision instead of the last change revision.
    #
    # The fetch checks out code like so:
    #     svn checkout url@revision.
    #
    # With the revision this format always works. But if the last change revision is used
    # it can lead to a problem.
    #
    # Example: Create a new branch for a dependency. This results in revision X.
    # Checking out revision X or > X works. But checking out a revision < X fails:
    #     i.e. svn: E170000: URL '<uri>' doesn't exist
    # Which is sensible. And since the last change revision is < X it too fails.
    #
    # But, to make things interesting, this is not a problem if we check out like so:
    #     svn checkout url -r revision
    #
    # With this syntax using the last change revision succeeds. Why? Subversion ignores
    # the branch. Instead it checks out the trunk/pre-branch url. That's dangerous.
    # Developers could easily make modifications and commit without realising the wrong
    # repository was used.
    #
    # Cons: The tagged revision may not correspond to a change in the dependency itself,
    # i.e. some dependencies share a repository and any change increments the global revision.
    # This makes the tag more difficult to understand.
    tag = _get_full_svn_revision_url(tag_url, str(tag_rev))

    tagged_dependencies[name] = tag

    output.write("Tagged %s" % tag)
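

# Illustrative sketch, not called anywhere in this script: the two checkout
# forms contrasted in the comments of _tag_dependency above, built as argument
# lists. The name _example_checkout_commands and the bare 'svn' executable are
# assumptions; the script itself builds its command lines from the SVN global.
def _example_checkout_commands(url, revision, target_dir):
    """Return (url@revision form, url -r revision form) as argv lists."""
    # Form used by the fetch: the revision is attached to the url
    peg_form = ['svn', 'checkout', '%s@%s' % (url, revision), target_dir]
    # Form the comments above warn against when tagging with the last change revision
    operative_form = ['svn', 'checkout', url, '-r', str(revision), target_dir]
    return peg_form, operative_form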


def _create_tagged_file(dependency_list, dependencyDir, tagged_file_path):
    """
    Create a tagged version of the dependency list.
    A tagged dependency list is required for reproducible builds. The tagged file lists each dependency and points at
    the revision checked out to the local disk.

    e.g.
        scripts https://wwwin-svn-sjc.cisco.com/core-grp/csf-core/scripts/trunk@2211
        ......

    """
    start_time = datetime.now()
    thread_queue = Queue.Queue()
    tagged_dependencies = {}
    names = []

    for line in dependency_list:
        name, _,  _ = _parse_line_info(line)
        names.append(name)
        _tag_dependency(thread_queue, tagged_dependencies, dependencyDir, name)

    try:
        tag_file = open(tagged_file_path, 'w')
    except IOError:
        print >> sys.stderr, "Fatal Error: Cannot create the tagged file - %s" % tagged_file_path
        sys.exit(1) # Kill the script

    first = True

    for name in names:
        if not first:
            tag_file.write("\n")
        else:
            tag_file.write("# Tagged on: %s\n" % _time_stamp())
            first = False

        tag_file.write(name + " " + tagged_dependencies[name])

    tag_file.close()

    print ("\n\n" + STARS + "\n\nSuccess - Dependencies have been tagged \n%s\n\n" + STARS + "\n") % _time_lapse(start_time)


def main():

    usage =  ('fetchDependencies.py -f dependencyList.txt -d dependencies\n'
              + 'The -f and -d options default to the values above\n\n'
              + "fetchDependencies.py will by default fetch the SVN Revision of the dependencyFile's directory\n")
    oparser = optparse.OptionParser(usage = usage)
    oparser.add_option('-d', '--dependencyDir', action='store', default="dependencies", help='The directory to place checked out dependencies.')
    oparser.add_option('-f', '--dependencyFile', action='store', default="dependencyList.txt", help='The dependency definition file to read.')
    oparser.add_option('-u', '--svnusername', action='store', default='', help="The SVN username to use for SVN checkouts. If you have cached credentials, you won't need this.")
    oparser.add_option('-p', '--svnpassword', action='store', default='', help="The SVN password to use for SVN checkouts. If you have cached credentials, you won't need this.")
    oparser.add_option('-i', '--interactive', action='store_true', default=False, help='Allows user to checkout non-local modified deps when a different dep is modified.')
    oparser.add_option('-v', '--verbose', action='store_true', default=False, help='Print verbose output.')
    oparser.add_option('-s', '--server', action='store_true', default=False, help='Dependencies with local modifications are deleted and a fresh copy checked out.')
    oparser.add_option('-t', '--taggedFile', action='store', help='Generate a tagged version of the dependency definition file pointing at a specific svn revision for each dependency')
    oparser.add_option('-T', '--taggedFileOnly', action='store', help='Generate a tagged version of the dependency definition file pointing at a specific svn revision for each dependency. With this option the fetch is not executed.')
    oparser.add_option('-r', '--revision', action='store', default=None,
                       help="Specify the Revision to check out. (ignored if off-repo or if specified in dependency file).")
    oparser.add_option('--useDependencyListUrls', action='store_true', default=False,
                       help= "If set, use the dependency list repo address; Default is to use the project's repository")

    _set_defaults_from_environment(oparser)

    options, _ = oparser.parse_args(sys.argv[1:])

    unprinted_options = frozenset([
        'svnpassword',
    ])

    global server_mode
    server_mode = options.server

    Dependency.translate_urls = not options.useDependencyListUrls


    print 'Arguments: '
    for key in sorted(options.__dict__.keys()):
        if not server_mode and key in unprinted_options:
            print TAB, "{0: <35}{1: <10}".format(key, 10 * '*')
        else:
            print TAB, "{0: <35}{1: <10}".format(key, options.__dict__[key])
        
    if options.taggedFileOnly and options.taggedFile:
        oparser.error("--taggedFileOnly (-T) and --taggedFile (-t) are mutually exclusive. -T prevents the fetch while -t expects a fetch.")

    dependDir = options.dependencyDir
    dependFile = options.dependencyFile

    global verbose
    verbose = options.verbose

    global SVN
    if 'SVN' in os.environ:
        SVN = [os.environ['SVN']]
    else:
        SVN = ['svn']

    global interactiveMode
    interactiveMode = options.interactive
    if interactiveMode:
        print "Interactive Mode - may ask for user input!"
    else:
        SVN += ['--non-interactive', '--trust-server-cert']

    if '' != options.svnusername:
        SVN += ['--username', options.svnusername]
    if '' != options.svnpassword:
        SVN += ['--password', options.svnpassword]
    if '' != options.svnusername or '' != options.svnpassword:
        SVN += ['--no-auth-cache']

    tagged_file_path = options.taggedFileOnly

    try:
        Dependency.project_url, Dependency.target_revision, _ = _get_project_svn_info()
    except FetchException:
        if options.revision:
            # We don't have a project url, so we cannot translate urls.
            Dependency.translate_urls = False
        else:
            err_msg = ('\n\n' + STARS + "\nUnable to find the project's revision number.\n"
                        'Therefore, you must provide a revision number on the command line.\n')
            sys.stderr.write(err_msg)
            sys.exit(1)

    if options.revision:
        Dependency.target_revision = options.revision

    if tagged_file_path:
        tag_only = True
    else:
        tag_only = False
        tagged_file_path = options.taggedFile

    raw_list = _read_dependency_file(dependFile)

    # Filter the line pointing to the master-dependencies
    master = [line for line in raw_list if line.startswith(MASTER_DIR)]
    if len(master) > 1:
        print "Multiple master-dependencies defined: using %s" % master[0]

    # Filter out the project dependencies
    dep_list = [line for line in raw_list if not (line.startswith(MASTER_DIR) or line.startswith(TESTS_BREAK))]

    if not tag_only:
        if master:
            master = [Dependency(master[0])]
            print ("\n\n" + STARS + "\n\nMaster\n\n" + STARS + "\n")

            # Fetch the master-dependencies which contains the master urls for the other dependencies
            _fetch_dependencies(master, dependDir)

            # Read the fetched master dependency file
            master_list = _read_dependency_file(os.path.join(dependDir, MASTER_DIR, "masterList.txt" ))
        else:
            print ("\n\n" + STARS + "\n\nWarning - no master-dependencies was found \n\n" + STARS + "\n")
            master_list = []
        # Assemble the list from the dependencyList.txt and masterList.txt
        assembled_list = _assemble_dependency_list(dep_list, master_list)

        # Fetch the dependencies
        print ("\n\n" + STARS + "\n\nDependencies\n\n" + STARS + "\n")
        _fetch_dependencies(assembled_list, dependDir)
        Dependency.print_updates()

    if tagged_file_path:
        print ("\n\n" + STARS + "\n\nTagging Dependencies\n\n" + STARS + "\n")
        _create_tagged_file(raw_list, dependDir, tagged_file_path)


if __name__ == '__main__':
    main()
